Wikipedia:WikiProject Articles for creation/AfC Process Improvement May 2018
This page documents the development process for a completed project undertaken by the WMF Growth team from April to October 2018.
The Wikimedia Foundation's Community Tech and Growth teams are extending the New Pages Feed interface to allow both Articles for Creation (AfC) reviewers and New Page Patrol (NPP) reviewers to prioritize pages for review using quality and copyright violation scores. This work will take place during May and June 2018.
Summary
Articles for Creation (AfC) is a process in English Wikipedia by which experienced Wikipedians review draft pages to determine whether they can be added to the main article namespace. Because of the April 2018 policy change known as ACPERM, traffic to the AfC process is expected to increase. In discussion with the AfC community, it has been determined that AfC will be able to promote more high quality pages to the main article namespace more quickly through enhanced prioritization tools. Therefore, the Community Tech team is building an improvement to the existing New Pages Feed interface that will allow AfC reviewers to use predictive models to prioritize drafts by their likelihood of copyright violation and their predicted quality. This improvement will also be available to reviewers in the New Page Patrol (NPP) process, who review pages created directly in the main article space, and are the original users of the New Pages Feed.
Background
For the full background on the motivation of this project and corresponding discussion by members of the AfC and NPP communities, please see the original project page.
Goals and metrics
Goal: we want AfC reviewers to be able to get high quality articles into the main namespace as quickly as possible.
Given the expected increase of submissions to AfC, we plan to help by increasing the efficiency with which reviewers can review drafts while maintaining their quality standards. Specifically, there are three metrics that we intend to maintain or improve:
- Mainspace rate: what percent of submitted drafts are in the main article namespace after 60 days?
- Survival rate: what percent of articles that came from AfC are not nominated for deletion after 90 days?
- Quality article waiting period: for drafts that were accepted on their first review, how long did they have to wait for their first review?
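As a rough illustration of how these three metrics could be computed, here is a minimal Python sketch. It is not the method agreed in the Phabricator discussion. The `DraftRecord` fields are hypothetical, and the sketch assumes that "after 60 days" means "moved to mainspace within 60 days of submission", that "after 90 days" means "not nominated within 90 days of the move", and that the waiting period is summarized by its median.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class DraftRecord:
    # Hypothetical fields, named for this illustration only.
    submitted: datetime
    first_reviewed: Optional[datetime] = None
    accepted_on_first_review: bool = False
    moved_to_mainspace: Optional[datetime] = None
    nominated_for_deletion: Optional[datetime] = None


def mainspace_rate(drafts: List[DraftRecord], days: int = 60) -> Optional[float]:
    """Fraction of submitted drafts moved to mainspace within `days` of submission."""
    if not drafts:
        return None
    window = timedelta(days=days)
    moved = [d for d in drafts
             if d.moved_to_mainspace is not None
             and d.moved_to_mainspace - d.submitted <= window]
    return len(moved) / len(drafts)


def survival_rate(drafts: List[DraftRecord], days: int = 90) -> Optional[float]:
    """Fraction of promoted drafts not nominated for deletion within `days` of the move."""
    promoted = [d for d in drafts if d.moved_to_mainspace is not None]
    if not promoted:
        return None
    window = timedelta(days=days)
    survived = [d for d in promoted
                if d.nominated_for_deletion is None
                or d.nominated_for_deletion - d.moved_to_mainspace > window]
    return len(survived) / len(promoted)


def quality_waiting_period(drafts: List[DraftRecord]) -> Optional[timedelta]:
    """Median wait before first review, for drafts accepted on their first review."""
    waits = [d.first_reviewed - d.submitted for d in drafts
             if d.accepted_on_first_review and d.first_reviewed is not None]
    return median(waits) if waits else None
```

Each function returns `None` when there is nothing to measure, so an empty reporting period is not mistaken for a rate of zero.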
Community discussion on these metrics is here.
Technical discussion on the appropriate way to create these numbers is in Phabricator here.
Solution
Through about a month of conversation on potential improvements, community members and WMF staff agreed that a way to help AfC reviewers pursue the goals and metrics above would be an interface that allows reviewers to prioritize drafts based on quality scores and to quickly assess copyright violations (copyvio). The New Pages Feed, which was built for the separate NPP process, is an existing interface that can facilitate this. The planned changes are as follows:
- New Pages Feed will be extended to include AfC drafts as a new list of pages for review, in addition to the existing "Article" and "User" pages that are currently available.
- The feed will be enhanced to allow prioritization by quality and copyvio scores for all pages in the interface.
- These new capabilities will be available to both the AfC and NPP reviewers.
Below is a set of user stories that describe in more detail what we plan to implement with these changes. User stories are a software engineering practice that helps clearly define what software will need to accomplish from the perspective of the user. The user stories below are based on discussion with AfC and NPP reviewers, and are broken down into four categories. The Community Tech team hopes to address all these stories, but may need to omit stories that prove to be too technically challenging for the limited scope of this project, or that the community judges to be unimportant.
In the future, this section will link to visual mockups built on these user stories, so that the community can comment on them before the software is built.
Sorting and filtering
As a reviewer, I need to sort and filter the New Pages Feed by prioritization information:
- I need to be able to filter by the four categories in the ORES draftquality model (vandalism, spam, attack, ok).
- I need to be able to filter by the six categories in the ORES wp10 model (Stub, Start, C-class, B-class, Good, Featured).
- I need to be able to filter to likely copyright violations, corresponding to a pre-set cutoff of the copyvio likelihood score.
- I need to be able to sort by the copyvio score.
- I need my filter and sorting settings to be sticky, so that I don’t have to reset them on every visit.
- I need to filter to only those drafts that have been submitted to AfC and are awaiting review. This would include drafts that are awaiting their second, third, etc. review, but it would exclude those drafts that have been submitted for review, have already been reviewed, and are awaiting resubmission by their authors.
- I need to filter to all drafts (not only those submitted to AfC).
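To make the filtering and sorting stories above concrete, here is a minimal Python sketch. It is an illustration only, not the New Pages Feed implementation: `FeedEntry`, `filter_feed`, `sort_by_copyvio`, and the `COPYVIO_CUTOFF` value are all hypothetical, and the copyvio score is assumed to be a likelihood between 0 and 1.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

# Placeholder value: the real cutoff for "likely copyvio" was not yet decided.
COPYVIO_CUTOFF = 0.5


@dataclass
class FeedEntry:
    title: str
    draftquality: str      # "vandalism", "spam", "attack" or "ok"
    wp10: str              # "Stub", "Start", "C-class", "B-class", "Good" or "Featured"
    copyvio_score: float   # assumed likelihood between 0 and 1
    awaiting_review: bool  # submitted to AfC and not yet reviewed in this round


def filter_feed(entries: Iterable[FeedEntry],
                draftquality: Optional[Set[str]] = None,
                wp10: Optional[Set[str]] = None,
                likely_copyvio_only: bool = False,
                awaiting_review_only: bool = False) -> List[FeedEntry]:
    """Keep entries that satisfy every filter that is switched on."""
    selected = []
    for entry in entries:
        if draftquality is not None and entry.draftquality not in draftquality:
            continue
        if wp10 is not None and entry.wp10 not in wp10:
            continue
        if likely_copyvio_only and entry.copyvio_score < COPYVIO_CUTOFF:
            continue
        if awaiting_review_only and not entry.awaiting_review:
            continue
        selected.append(entry)
    return selected


def sort_by_copyvio(entries: Iterable[FeedEntry],
                    highest_first: bool = True) -> List[FeedEntry]:
    """Order entries by copyvio score."""
    return sorted(entries, key=lambda e: e.copyvio_score, reverse=highest_first)
```

Leaving a filter at its default means "no restriction", which corresponds to the "all drafts" story; sticky settings would be a matter of saving the chosen arguments per reviewer.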
Listing
As a reviewer, I need to see prioritization information next to each page’s entry in the New Pages Feed:
- I need a page's draftquality category and wp10 category to be displayed.
- I need a page's copyvio score to be displayed.
- I need to click through to see more information about a copyright score, specifically the likely violating text and its source, similarly to how Earwig's Copyvio Detector and CopyPatrol currently work.
Selection
As a reviewer:
- I need to be able to select a random draft for review, so that drafts with middling scores don’t get excluded from review perpetually.
- I need to not accidentally attempt to review a draft already under review by another reviewer.
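The two selection stories can be combined in one small routine: pick a draft at random, but only from drafts that nobody else has marked as under review. The following Python sketch shows the idea. The function name and its inputs are hypothetical, and how the set of drafts under review would actually be obtained is left open.

```python
import random
from typing import Iterable, Optional, Set


def pick_random_draft(draft_titles: Iterable[str],
                      under_review: Set[str],
                      rng: Optional[random.Random] = None) -> Optional[str]:
    """Return a random draft title that is not already under review, or None."""
    rng = rng or random.Random()
    candidates = [title for title in draft_titles if title not in under_review]
    if not candidates:
        return None
    return rng.choice(candidates)
```

Because every eligible draft has the same chance of being chosen, drafts with middling scores are not starved of attention by score-based sorting.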
Other
As a reviewer:
- I need models to be up-to-date with the latest revision of a page at all times.
- I need copyvio scores to be up-to-date with the latest revision of a page at all times.
- I need to be able to easily learn more information about how copyvio and ORES scores are generated.
Execution plan and status
The Community Tech team started planning this project on May 1, 2018, with involvement from WMF designers and community liaisons. The next steps are as follows, and will contain a more specific timeline in the future:
1. Community Tech software engineers will investigate the feasibility and challenges of the user stories listed above. Specifically, those investigations are tracked in these three Phabricator tasks:
- Investigation: using ORES for page review prioritization
- Investigation: applying copyvio for page review prioritization
- Investigation: detecting which drafts are submitted and awaiting review by AfC
2. The team will develop some initial visual mockups of how the New Pages Feed could change with this work.
3. We will post those mockups here and on the talk pages for the AfC and NPP projects for feedback and input from those communities.
4. Engineering work will begin, and we will strive to roll out incremental improvements for testing as soon as they are ready.
Update 2018-05-10: design session
On May 8, five of us on the WMF team got together to think through some of the design details around this feature improvement for the New Pages Feed. We talked through the user stories listed above on this page with two objectives: (1) surface as many unanswered questions as we could, and (2) make sure our designer had enough clarity to begin to mock up some ideas. It's not likely that we'll be able to design or engineer according to all the thoughts below, but I'm posting our full notes so that everyone can follow along with our process. Please comment and chime in on the talk page, especially regarding the "Open questions" list at the bottom. We hope to show some mockups for feedback in the next couple weeks.
- Sorting/filtering
  - We want to look into allowing the feed to list both drafts that are and are not submitted for review. The former use case would primarily be for deleting drafts that are copyright violations.
  - We want to look into a "sort by random" option so that reviewers can choose random pages.
  - We will try to mock up an interaction where the reviewer selects whether they are doing "New Page Review" or "Articles for Creation", which would automatically set certain settings, and make irrelevant ones disappear.
  - We want to consider two concepts to get feedback from reviewers:
    - Surfacing straightforward checkboxes for all the data elements that reviewers can choose, i.e. all four draftquality categories and all six wp10 (quality assessment) categories.
    - Phrasing the options more like sentences, e.g. "Only those drafts likely to be vandalism".
  - Because we'll be adding sorting options beyond just "Newest" and "Oldest", we'll need to add a dropdown box for sorting in Special:NewPagesFeed.
  - We do not think reviewers will need to sort by numeric ORES scores after filtering to an ORES category.
- Listing
  - We will find out whether we can use the API of Earwig's Copyvio Detector, and if so, think about how to link reviewers out to that tool to dig into copyvio details.
  - We'll try to keep drafts under review out of the New Pages Feed upon load/refresh. (We won't be able to make them dynamically disappear from your screen when someone else puts them under review after you have loaded the page.)
  - We need to decide how the blue "review" button should behave for AfC. (This button does the same thing as clicking on the page title.) Perhaps there are good designs that don't have a blue button, and we can try multiple concepts.
  - We need to decide what to do with the icons on the left for AfC (e.g., the trash can icon for pages that have been nominated for deletion). Perhaps we can make icons that point out copyright violations or different ORES categories.
- Other
  - We'll want to refer to the two ORES models with more descriptive names than "draftquality" and "wp10".
  - We will look into the technical feasibility of rescoring articles with ORES and copyvio on every edit, but will need to keep an eye on whether this overloads their respective APIs.
- Open questions for community members interested in AfC and NPP
  - We have an open question around how common it is for drafts to get created in User space, as opposed to Draft space. Will AfC reviewers commonly need to review in the User space?
  - We talked about whether these new sort options would make it more likely that AfC reviewers attempt to review the same drafts at the same time. There's been some discussion on this project's talk page about that, but we wanted to hear more. How often do two reviewers attempt the same draft at the same time, and what is that experience like?
  - The explanatory text (and any related references on other pages) that shows at the top of the New Pages Feed will likely need to be updated to reflect the additional future uses of the feed. When would it make sense to start that discussion? Perhaps in a few weeks, when it's clearer what will change?
  - Is there an existing sense of what copyvio score a reviewer would consider to be high enough to investigate? If there is a clear threshold, we may be able to include that as a quick filter in the New Pages Feed, in addition to allowing it to be sorted by score.
Please remember that these ideas have not been thoroughly vetted for engineering feasibility yet. We won't be able to do everything in this list this time, and we may or may not be able to do all of the most interesting ideas.
- Please see talk page for comments.
Update 2018-05-17: wireframes and copyvio
Over the last week, the WMF team has been following the execution plan above by working in two tracks:
- Front-end: We have been thinking through the design of how the front-end of the New Pages Feed would be changed to include pages from Draft space, quality scores, and copyvio scores.
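- Back-end: We have been investigating how to technically implement these changes on the back-end. This has three parts, each tracked in its own Phabricator task: using ORES for page review prioritization, applying copyvio for page review prioritization, and detecting which drafts are submitted and awaiting review by AfC.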
Below, I have notes and questions for community members on both these tracks, as well as our current next steps.
Front-end
Working from the notes from the design session above, and from the resulting conversation on the talk page, we made some low-fidelity wireframes of how we think the expansion of the New Pages Feed could look. I hope everyone can take a look at these images (and the corresponding annotations), and think through whether this will work for your NPP or AfC reviewing workflow – and if not, how the design could be better (they're just wireframes, after all, so nothing is set in stone).
Some overall notes:
- For the sake of simplicity, the wireframes don't show the whole New Pages Feed – but rather include detail only in those places where we need feedback the most. That said, we don't have plans to make changes to any of the elements of the New Pages Feed that are not shown in the wireframes. In fact, we want to avoid disrupting existing New Pages Feed workflows as much as possible.
- Though the wireframes show copyvio scores and the ability to sort and filter on them, copyvio is currently our stickiest technical investigation with the most open questions – so those parts of the design are the most likely to need changes. See the Back-end section below for more details.
- We're optimistic on our technical ability to make it possible to sort drafts by their date of submission (rather than the date of creation), but not 100% sure on that.
- Similarly, we're not completely sure about the ability to sort by "Random" – and it looks like some reviewers feel this is not a very important feature. We're interested in more thoughts on whether that capability would be important and how you would expect it to behave. Perhaps the existing "Find a random AfC submission" button serves this purpose.
- In addition to overall reactions to the wireframes, we are specifically looking for feedback on the distinction between "concept A", which provides more granularity, and "concept B", which provides more structure. See the annotations below for details on the distinction.
Wireframes shown: "AfC list", "AfC menus: concept A", "AfC menus: concept B", "NPP menus: concept A".
Annotations to the wireframes:
- AfC list. This shows the New Pages Feed with two changes.
  - There is a new toggle for choosing between "New page patrol" and "Articles for Creation". Selecting "Articles for Creation" would restrict to Draft space, and enable different filtering and sorting options.
  - The list of pages now contains quality, classification, and copyvio values.
- AfC menus: concept A. This shows the feed with its filtering and sorting menus open.
  - In this concept, reviewers select specific quality, classification, and copyvio values.
  - Reviewers also choose between four different draft "states" to filter the list (with "Awaiting review" being the default).
  - Reviewers can narrow to drafts above or below a copyvio score.
  - Reviewers can sort the list by most or least recent submission date to AfC.
- AfC menus: concept B. This is another take on the same menus in concept A.
  - In this concept, instead of selecting specific quality, classification, and copyvio values, the reviewer would choose from a more structured set of reviewing options (which would not necessarily be the ones currently listed in the wireframe).
  - This concept might make it simpler, quicker, and easier for reviewers to find groups of articles that interest them – one click to get what you want.
- NPP menus: concept A. This shows the situation for NPP review.
  - All existing filtering options are available, plus the additional quality, classification, and copyvio options.
  - This image uses the detailed style from "concept A", but the same approach from "concept B" could be applied here, if that is preferred.
Back-end
The full detail of the technical investigations can be found in the Phabricator tasks, but a summary is included here.
The biggest challenge is around the best way to incorporate copyvio scores. The engineers here are still discussing and thinking through options, so we unfortunately do not have a complete plan right now. We are eager to hear anything that community members can add to the mix to help us define a path forward. There are three main issues:
- Usage of existing tools: Although Earwig's Copyvio Detector and CopyPatrol are two working and valuable tools for copyvio, we will unfortunately not be able to simply incorporate their APIs or their code directly into the New Pages Feed. As much as we'd like to take advantage of that existing work, the issue is that the New Pages Feed is part of the MediaWiki software, and we are bound by the general rule of not depending on external tools within MediaWiki, because that adds the risk of the New Pages Feed breaking if the external tools break. This means that to implement copyvio for New Pages Feed, we'll need to build similar functionality to those existing tools, as opposed to inheriting what those tools already do.
- Services: There are two main services that the existing copyvio tools use. Earwig's tool uses both the Google API and Turnitin, and CopyPatrol uses only Turnitin. If we build new copyvio functionality, we'll need to decide which service(s) to use. There are pros and cons to both services. Google thoroughly checks the whole internet, while Turnitin's scope is narrower. But on the other hand, Google requires much more coding logic to implement because its API is not built specifically for copyright searches, whereas Turnitin's API is built for exactly that purpose.
- Usage limitations: Both Google and Turnitin are services with limits, in that WMF only has a certain number of credits with which to run searches. Earwig's tool has historically hit limits (and therefore stopped working for the rest of the day), sometimes multiple times per week – though some recent changes may help. Adding additional load to either of those services through New Pages Feed means we'll need to (a) be conscious of how often we are scoring and rescoring pages with copyvio, and (b) potentially open discussions with those outside services about our limits.
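The usage-limit concern can be pictured as a daily budget of search credits that has to be checked before every copyvio lookup. The Python sketch below is only an illustration of that bookkeeping. The class name and the idea of a budget that resets each calendar day are assumptions, and neither Google's nor Turnitin's actual quota rules are modeled here.

```python
from datetime import date
from typing import Optional


class DailySearchBudget:
    """Hypothetical tracker for a fixed number of search credits per day."""

    def __init__(self, daily_limit: int) -> None:
        self.daily_limit = daily_limit
        self._day: Optional[date] = None
        self._used = 0

    def try_spend(self, today: date, credits: int = 1) -> bool:
        """Spend credits if the budget allows; return False when it is exhausted."""
        if today != self._day:
            # A new day starts with a fresh budget.
            self._day = today
            self._used = 0
        if self._used + credits > self.daily_limit:
            return False
        self._used += credits
        return True
```

A caller that gets `False` would postpone scoring the page rather than fail, so the feed keeps working with older scores once the day's credits run out.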
Given these three issues, we currently think that the quickest path to adding copyvio to New Pages Feed would be through using Turnitin, which is the more straightforward service to implement. In that vein, there are several open questions for community members that we need some guidance on:
- For those of you who have experience with both the Google and Turnitin services, what are your thoughts on their pros and cons for copyvio? It seems to us that Turnitin is more reliable for positive identification than negative identification – in other words, if Turnitin does not find anything, that does not mean it's not a violation.
- If the New Pages Feed were sortable/filterable by copyvio scores, what would the next step for a reviewer be upon finding pages with high scores? Is the next step always to look them up in Earwig's tool? Are there any other workflows? I know this has been discussed on the talk page already, but I do want to get some more opinions.
- We're thinking about how frequently to re-score pages as they change. Given that we want to apply copyvio to all pages in the feed so that they can be sortable/filterable, that's many thousands of pages. Re-scoring them each on every edit would almost certainly be too frequent given our limitations. We're thinking that better approaches might be to re-score pages after a certain amount of the page has been changed, or re-score them on a certain schedule. That means that at any given time, the copyvio score would be "approximate". Does this sound sufficient for the purpose of finding a page you would like to review? Or are there other advantageous approaches?
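The two re-scoring approaches mentioned in the last question (re-score after enough of the page has changed, or on a schedule) can be combined in one check. The Python sketch below is an illustration under stated assumptions: the 20% change threshold and the seven-day maximum age are placeholder values, the function names are hypothetical, and the amount of change is estimated by comparing word sequences with the standard library's `difflib.SequenceMatcher`.

```python
import difflib
from datetime import datetime, timedelta

# Placeholder values chosen for illustration only.
CHANGE_THRESHOLD = 0.2
MAX_SCORE_AGE = timedelta(days=7)


def changed_fraction(scored_text: str, current_text: str) -> float:
    """Estimate how much of the page changed since it was last scored (0 to 1)."""
    matcher = difflib.SequenceMatcher(
        None, scored_text.split(), current_text.split(), autojunk=False)
    return 1.0 - matcher.ratio()


def needs_rescore(scored_text: str, current_text: str,
                  scored_at: datetime, now: datetime) -> bool:
    """Re-score when the score is too old or the page has changed enough."""
    if now - scored_at >= MAX_SCORE_AGE:
        return True
    return changed_fraction(scored_text, current_text) >= CHANGE_THRESHOLD
```

Under a policy like this, a small copyedit leaves the stored score in place, which is why the displayed score would be "approximate" at any given time.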
Next steps
Despite the challenges with copyvio described above, we are moving forward on several fronts. Our expectation is that we'll deliver the changes to the New Pages Feed in segments, as opposed to all at once. I'm going to be making Phabricator tasks for several of the following items.
- We are going to begin the back-end work on making it possible for Draft pages to appear in the New Pages Feed.
- We are going to begin the back-end work of scoring drafts with ORES models for quality.
- We are going to continue the investigation on the right way to incorporate copyvio, taking into account community reactions to the above points.
- We are going to listen to reactions to the wireframes and change the designs as needed.
Thank you, and as always, I'm looking forward to the discussion on the talk page.
Update 2018-05-29: Phabricator tasks created and under development
For those of you who are interested in following along in Phabricator, I've made tasks for the parts of this project that have reached a level of clarity such that the WMF team can start their engineering work. Thanks to all members of the reviewing community who helped us get to this level of clarity. To that end, we're now excited to say that engineering work is happening over the next four weeks on the following specific work items (all listed under the same epic):
- New Pages Feed: add drafts to the feed (1.1)
- New Pages Feed: filter to draft states (1.2)
- New Pages Feed: submitted date sorting option for drafts (2)
- New Pages Feed: generate ORES scores (3.1)
- New Pages Feed: filtering on ORES scores (3.2)
The copyvio parts of this project do not yet have Phabricator tasks created. Since that is such an important part of the work, and the largest engineering challenge, the conversation is still ongoing around the right way to implement it. I'm hoping we'll be able to come to some conclusions and be able to create those Phabricator tasks by next week.
These tasks still have a fair number of open questions in them that I'll be working to settle this week, and about which I will likely have questions on the talk page. The tasks may also split, merge, or gain detail over the coming weeks. That said, if anyone is interested in reading over them, the team here welcomes questions and clarifications, and especially wants to hear if we've misunderstood something important about the AfC and NPP workflows.
Importantly, we will be testing and asking for feedback on these changes to the New Pages Feed before they are rolled out to users. We definitely do not want to change the feed in ways that are surprising or unproductive. It's my intention to give as many visual progress reports as possible as this work unfolds, and to collaborate with the NPP and AfC communities to roll out changes deliberately.
In that vein, I'm hoping that it will be possible to roll out parts of the project to reviewers as they are complete, instead of waiting for all the parts to be done. That way, we'll be able to get real feedback early, and hopefully deliver goodness on this project as soon as possible.
Update 2018-06-07: work continues on adding drafts to the feed
Over the past week, the Community Tech team has been working on adding draft pages to the New Pages Feed, and making it possible to filter those drafts based on where they are in the AfC process. That work is progressing well so far, without any surprises. Two additional work items that were detailed during the week are the need to explicitly test that drafts are listed in the feed correctly before making changes to production, and making sure that the whole draft backlog is added to the feed.
With respect to the second part of the project -- adding ORES scores to the feed -- Community Tech engineers have talked with engineers on WMF's Scoring Platform team, who are currently working on adding the relevant ORES models to the MediaWiki database to make them easily queryable by features like the New Pages Feed. We're optimistic that that team's work will mean Community Tech will not have to implement something additional to use ORES scores.
And with respect to the third part of the project -- copyvio -- we are taking some time as the engineers work on the beginning parts of this project to think about the right way to implement this capability in the New Pages Feed, taking into account all the thoughts on the Talk page.
Update 2018-06-14: initial screenshot from work in progress
The Community Tech team has made progress over the past week on adding draft pages to the New Pages Feed. We have been working on the same Phabricator tasks as mentioned in last week's update (phab:T195545 and phab:T195924), and we added one additional task to the to-do list: adding a feature flag. This will allow us to wait until a cohesive set of changes is developed for the New Pages Feed before exposing any of them to reviewers -- as opposed to the feed changing in little, incomplete ways over time.
At this point, we do have a glimpse of how the user interface changes are shaping up -- though it is still a work in progress. The image below is from a developer's local environment, meaning these changes are not yet available on the actual wikis for anyone to see.
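For readers unfamiliar with feature flags, the idea is that new code ships switched off and is enabled by a single configuration setting once the whole set of changes is ready. The Python sketch below illustrates the pattern only. The flag name `enable_afc_feed` and the function are hypothetical and do not reflect the actual MediaWiki configuration.

```python
from typing import Dict, List


def available_feed_modes(config: Dict[str, bool]) -> List[str]:
    """Return the review modes to show, hiding the AfC mode unless the flag is on."""
    modes = ["New Page Patrol"]
    # Hypothetical flag name; a missing flag is treated as "off".
    if config.get("enable_afc_feed", False):
        modes.append("Articles for Creation")
    return modes
```

With the flag off, reviewers see the feed exactly as before, even though the new code is already deployed alongside it.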
The screenshot shows a couple of important points of progress:
- The new toggle in which a reviewer will select whether they are doing "New Page Patrol" or "Articles for Creation".
- With "Articles for Creation" selected, the feed is restricted to drafts. The four statuses of where a draft can be in its lifecycle are available as filters ("Unsubmitted", "Awaiting review", "Under review", "Declined") as well as an option to view "All" drafts.
It also shows a few things that are incomplete:
- Next to the word "Showing", the filters selected in the menu will be listed, as opposed to saying "reviewed, unreviewed".
- The new sorting options for submission and declined dates have not yet been implemented.
- The developer's environment only has three drafts for testing, as opposed to the thousands that exist in reality.
- Most drafts would not have "No categories".
- The ORES and copyvio work has not yet been undertaken.
Please take a glance at the screenshot and add any of your reactions to this project's talk page. Sometimes seeing something take shape can inspire thoughts that wouldn't have occurred before, and we definitely want to get a sense for whether this feels like it's on the right track.
Update 2018-06-21: now able to filter drafts by state in testing environment
Over the past week, the Community Tech team has mostly been working on phab:T195924, which is about making it possible to filter drafts by their state ("Unsubmitted", "Awaiting review", "Under review", "Declined", "All"). The team has stood up the software in a testing environment as they develop it, and we have been noting issues in Phabricator as we test out the changing capabilities.
The next major item the team will be working on is making it possible to sort drafts by their submitted and declined dates.
Update 2018-06-28: team update and beginning work with ORES
There are four main topics in this update:
- Collaboration team involvement
- Current work on sorting options
- Upcoming testing
- Beginning work with ORES
Collaboration team involvement
We wanted to let everyone know about a team assignment change that will hopefully help this project be completed sooner. So far, the engineering on this project has been done by the Community Tech team. That team has been working on the first major part of the project, which is to add drafts to the New Pages Feed, and make the feed filterable by state and sortable by date. Starting next week, a different Wikimedia Foundation team, the Collaboration team, will be completing the second and third parts of the project, which are adding ORES scores and adding copyvio detection. The Collaboration team was the team that added ORES scores to the Recent Changes feed, giving them good experience using ORES scores and working with the various feeds in MediaWiki. I (MMiller (WMF)) will continue to be the product manager for this work. Community Tech has been doing great work so far, and we're being careful to transfer their knowledge to Collaboration so that the project continues smoothly.
Current work on sorting options
The last item that the Community Tech team is working on with this project, before the Collaboration team begins their work, is phab:T195547, which will make it possible to sort AfC drafts by their most recent date of submission or most recent date they were declined, in addition to the original date they were created. That is the main work item currently underway this week and next.
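The three sort orders can be sketched as follows in Python. This is an illustration rather than the code in phab:T195547: the `DraftHistory` record and `sort_drafts` function are hypothetical, and placing drafts that have no submission or decline date at the end of the list is an assumption made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, List, Optional


@dataclass
class DraftHistory:
    # Hypothetical record: one creation date, any number of submissions and declines.
    title: str
    created: datetime
    submissions: List[datetime] = field(default_factory=list)
    declines: List[datetime] = field(default_factory=list)


def _latest(dates: List[datetime]) -> Optional[datetime]:
    return max(dates) if dates else None


SORT_KEYS: Dict[str, Callable[[DraftHistory], Optional[datetime]]] = {
    "created": lambda d: d.created,
    "submitted": lambda d: _latest(d.submissions),
    "declined": lambda d: _latest(d.declines),
}


def sort_drafts(drafts: List[DraftHistory], by: str = "submitted",
                newest_first: bool = True) -> List[DraftHistory]:
    """Sort by the chosen date; drafts lacking that date go last (an assumption)."""
    key = SORT_KEYS[by]
    dated = [d for d in drafts if key(d) is not None]
    undated = [d for d in drafts if key(d) is None]
    return sorted(dated, key=key, reverse=newest_first) + undated
```

Using the most recent submission rather than the first one means a resubmitted draft is ordered by when it re-entered the queue.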
Upcoming testing
Now that the initial work to add drafts to the New Pages Feed is largely complete, we are setting up our ability to rigorously test the new functionality. We are working to surface the new features in the Test Wiki next week. Once the new features are there, we will post another update asking reviewers to try them out and reply with thoughts and bugs. At that point, reviewers who are testing might determine that the simple addition of drafts to the New Pages Feed, even without ORES and copyvio, is enough of an improvement that it could be put into production on English Wikipedia.
Beginning work with ORES
As mentioned above, next week the Collaboration team will begin the work to integrate ORES models into the New Pages Feed. Now that the ORES work will be beginning, I wanted to resurface a previous conversation and a decision we've made about how to proceed. In the project update from 2018-05-17, we posted wireframes for what we called "Concept A" and "Concept B".
- Concept A: reviewers would be able to choose specific "Predicted class" categories (Stub, Start, C-class, etc.) and specific "Predicted issues" categories (Spam, Attack, etc.)
- Concept B: instead of choosing specific categories, reviewers would choose from a smaller set of structured recommended options, e.g. "Likely high quality" or "Likely low quality".
In the discussion on the talk page, some reviewers preferred Concept A because it gives reviewers more control, and some preferred Concept B because it provides clearer recommendations and less opportunity for mis-using the model scores. We have decided to implement Concept A because we believe it will be a good stepping stone to help reviewers figure out whether Concept B is better, and if so, what rules should be used for the structured options of Concept B. In other words, by implementing Concept A, reviewers will have the opportunity to experiment with the ORES scores, decide whether Concept B is preferred, and then develop the rules for it. From the engineering perspective, having implemented Concept A, it will be relatively easy to subsequently implement Concept B.
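One way to see why Concept B is easy to build once Concept A exists: each structured option in Concept B can be defined as a preset selection of Concept A categories. The Python sketch below illustrates that relationship. The preset rules are invented placeholders, since the actual rules were left for reviewers to develop, and the function and key names are hypothetical.

```python
from typing import Dict, List, Set

ALL_ISSUES = {"vandalism", "spam", "attack", "ok"}

# Placeholder rules for illustration; the real rules had not been defined.
CONCEPT_B_PRESETS: Dict[str, Dict[str, Set[str]]] = {
    "Likely high quality": {
        "classes": {"B-class", "Good", "Featured"},
        "issues": {"ok"},
    },
    "Likely low quality": {
        "classes": {"Stub", "Start"},
        "issues": ALL_ISSUES,
    },
}


def concept_a_filter(pages: List[dict], classes: Set[str],
                     issues: Set[str]) -> List[dict]:
    """Concept A: the reviewer picks the exact categories to include."""
    return [p for p in pages
            if p["predicted_class"] in classes and p["predicted_issues"] in issues]


def concept_b_filter(pages: List[dict], preset_name: str) -> List[dict]:
    """Concept B: a named option that expands to a Concept A selection."""
    preset = CONCEPT_B_PRESETS[preset_name]
    return concept_a_filter(pages, preset["classes"], preset["issues"])
```

Changing what a structured option means would then only require editing the preset table, not the filtering logic.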
Please do post on the talk page with reactions, questions, or any other thoughts.
Update 2018-07-05: revisiting copyvio and setting up testing environment
Now that there are two teams working on this project, Community Tech and Growth (the new name of the Collaboration team), there are a handful of interesting updates:
- The Community Tech team has continued to work on making drafts sortable by their submission and declined dates.
- The Growth team has started work on surfacing ORES scores for "predicted class" and "predicted issues". That work is happening in two Phabricator tasks: one for the back-end, and one for the front-end.
- Both teams are working to set up a useful testing environment so that community members can test out this functionality before it becomes part of English Wikipedia. Although my previous update predicted that would be available this week, it has taken longer than expected to do it right, and the teams are still working on it.
- We talked with the community developers behind CopyPatrol and Earwig's Copyvio Detector to get their perspectives on our best path forward for adding copyvio detection to the New Pages Feed, while staying inside our technical limitations and licensing limits with outside vendors. The notes from those conversations are on this Phabricator task. Next week, the Growth team will incorporate that information as they plan the copyvio phase of this project.
- Some of the code refactoring work that the teams did for the New Pages Feed accidentally caused a bug in which pages created by autopatrolled editors were present in the New Pages Feed, instead of skipping the feed. That bug was reported on Friday, June 29, and was fixed on Monday, July 2.
Update 2018-07-12: ORES work and copyvio planning
If any AfC or NPP reviewers will be at Wikimania next week, please let me know! I'm hoping to meet some members of this community in person.
We're now formally testing the components of this project in our testing environments. As I've said in previous updates, as soon as we're technically able to do so, I'll ask community members to take some time to test things out as well.
Over the past week, the Community Tech team has continued the work to make drafts sortable by their submission and declined dates. And the Growth team has been writing the code to incorporate ORES scores into the New Pages Feed, and most of that code is now under review before it makes its way to the testing environment.
The Growth team has also learned a lot about using Google and Turnitin for copyvio detection, and has had multiple architectural conversations this week to narrow in on an approach.
Update 2018-07-20: testing, ORES work, and copyvio benchmarking
Over the past week, the Growth team has finished writing most of the components necessary for applying ORES scores to pages in the New Pages Feed, and along with the filters for the state of drafts, those components are now in our internal testing environments where QA staff are ironing out bugs.
We also conducted a comparison of the two main services that English Wikipedia uses for copyvio detection: Google search (used by Earwig's Copyvio Detector) and Turnitin (used by CopyPatrol). The objective was to help us understand how different the two services are in terms of their results. I'll be assembling the results and posting them in a coming update. We will use that information to help decide which service to use for the New Pages Feed, in addition to considerations around the usage limits for those services.
Update 2018-08-06: ready for community testing
Starting today, everyone is welcome to test out the Growth team's progress on the New Pages Feed using Test Wiki! This has been a long time coming, and our team is excited that you'll be able to get your hands on the work so far. Please make sure to read the "How to test" section below to configure your account. The idea here is that we want to get the reactions and thoughts of AfC and NPP reviewers on an ongoing basis to make sure that we continue to build something useful. Going forward, we'll continue to push updates of the software to Test Wiki for everyone to try out as soon as possible. I'll post here when there is something new to try.
Important notes about testing
- Rough edges: since this is a testing environment, we'll sometimes be pushing code even before the team's own QA engineer has had a chance to thoroughly test. That means the work will have rough edges, and sometimes bugs will slip through. Please point them out to us so we can fix them! We're hoping that by putting this work in a place where the community can see it earlier, we'll save everyone time in the long run.
- Right now, you'll be able to try out these new capabilities:
- All drafts are in the New Pages Feed under the "Articles for Creation" toggle at the top.
- It is possible to filter to just those drafts of a given state in the AfC process (Unsubmitted, Awaiting review, Under review, Declined).
- It is possible to sort drafts by their "submitted date" and "declined date", in addition to their "created date".
- The New Pages Feed for NPP should behave exactly as usual, with no changes.
- These capabilities are not yet part of the testing environment, but will be in coming weeks:
- Listing the draft's AfC state and dates with its entry in the feed.
- ORES scores for "predicted issues" and "predicted class". This code is mostly written and in the process of being merged and reviewed.
- Copyvio detection. The code here is starting to be written. I will post a separate update about our plans and decisions on this part of the project.
How to test
- Look at, sort, and filter the New Pages Feed. This can be done even without logging in.
- Try your AfC reviewing workflow with the AFCH gadget. To do this, you will need to do two things:
- Log in to Test Wiki as an autoconfirmed user. I expect many of you who will want to do testing are already autoconfirmed in Test Wiki. If you find that you're not, or you are unable to use the AFCH gadget, let me know on my User talk page, and I will change your user group.
- Turn on the "Yet Another AFC Helper Script" gadget under Preferences --> Gadgets.
- The Test Wiki is currently populated with a couple thousand articles and a few dozen drafts that you'll be able to see in the feed. You are also welcome to create new drafts via the Article Wizard. Note that in Test Wiki, the Article Wizard is found at a different URL than in English Wikipedia.
- Since this is just a testing environment, the content or references in a draft are not important.
Giving feedback
- You can post any thoughts, ideas, or comments on this project's talk page.
- You can also create Phabricator tasks. If you use Phabricator, you can either tag me or tag the Growth Team so that we see the ticket you create.
- Here are some of the questions we're hoping to learn about from this testing:
- Is this a better way to find and prioritize AfC drafts for review than the current method?
- Are there important parts of the AfC workflow that are not being captured?
- Have there been any undesired changes to the workflow for New Page Patrol?
- Would it be useful to add this capability to the New Pages Feed in English Wikipedia even before the ORES and copyvio elements are ready?
Update 2018-08-08: new date features in test environment
We've deployed a few changes to the testing environment. Please check them out and let us know what you think!
- Each draft in the feed now lists its state ("Awaiting review", "Declined", etc.)
- Each draft in the feed now lists its "Submitted date" if its state is "Awaiting review" or "Under review", or its "Declined date" if its state is "Declined".
- The "Sort by" menu allows sorting by "Submitted date" only when states "Awaiting review" or "Under review" are selected, and allows sorting by "Declined date" only when state "Declined" is selected.
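The date and sorting rules above can be sketched in a few lines of Python. This is only an illustration, not the feed's actual code: the function names are hypothetical, and it assumes that "Created date" is always offered and that a date-based sort is offered only when every selected state carries that date.

```python
# Hypothetical sketch of the date display and "Sort by" rules described above.
SUBMITTED_STATES = {"Awaiting review", "Under review"}
DECLINED_STATES = {"Declined"}


def displayed_date_label(state):
    """Return which extra date a draft's feed entry lists, if any."""
    if state in SUBMITTED_STATES:
        return "Submitted date"
    if state in DECLINED_STATES:
        return "Declined date"
    return None  # e.g. "Unsubmitted" drafts list no extra date


def available_sort_options(selected_states):
    """Return the sort options for the currently selected state filters."""
    selected = set(selected_states)
    options = ["Created date"]  # assumption: always available
    # Assumption: a date sort is offered only if all selected states have it.
    if selected and selected <= SUBMITTED_STATES:
        options.append("Submitted date")
    if selected and selected <= DECLINED_STATES:
        options.append("Declined date")
    return options
```

Under these assumptions, selecting both "Awaiting review" and "Declined" leaves only "Created date", because no single date applies to every draft in the result.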
We have not yet done any work on the formatting of the data presented with each draft in its listing in the feed. Because it's a lot of dense information, we would like to hear any suggestions to make it more readable.
Update 2018-08-16: ORES categories now available for testing
The team has now deployed a major set of work to the testing environment. Please check it out and let us know what you think:
- All pages in the New Pages Feed, whether on the "New Page Patrol" or the "Articles for Creation" side, are being given "Predicted class" and "Predicted issues" categories using ORES models.
- Predicted class: Stub, Start, C-class, B-class, Good, Featured
- Predicted issues: spam, attack, vandalism, no issues
- Those categories are listed in the feed with each page.
- The feed is also filterable by the categories, so, for instance, a reviewer could look only at pages predicted to be spam, or pages predicted to be C-class or better.
- As new edits are saved, pages are re-scored by the models in real time, so new edits should be reflected immediately.
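To make the filtering idea concrete, here is a minimal Python sketch of "C-class or better" and "predicted to be spam" filters. The page records and field names (`predicted_class`, `predicted_issues`) are invented for illustration and are not the feed's real data format; the class ordering is taken from the list above.

```python
# Hypothetical sketch: filtering feed entries by their predicted categories.
CLASS_ORDER = ["Stub", "Start", "C-class", "B-class", "Good", "Featured"]
RANK = {name: position for position, name in enumerate(CLASS_ORDER)}


def at_least(pages, minimum_class):
    """Keep pages whose predicted class is minimum_class or better."""
    threshold = RANK[minimum_class]
    return [page for page in pages if RANK[page["predicted_class"]] >= threshold]


def with_issue(pages, issue):
    """Keep pages whose predicted issues include the given issue."""
    return [page for page in pages if issue in page["predicted_issues"]]


# Made-up example entries; an empty issue list stands for "no issues".
pages = [
    {"title": "Draft:A", "predicted_class": "Stub", "predicted_issues": ["spam"]},
    {"title": "Draft:B", "predicted_class": "C-class", "predicted_issues": []},
    {"title": "Draft:C", "predicted_class": "Good", "predicted_issues": []},
]

print([page["title"] for page in at_least(pages, "C-class")])
print([page["title"] for page in with_issue(pages, "spam")])
```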
In order to see how the models change with different content, it can be helpful to paste wikitext from other articles in the Test Wiki, noting in the edit summary which article it came from. Feel free to create new drafts with the Article Wizard, and refer to the "How to test" section above for more details (or ask on the talk page).
A few notes on outstanding work that we're still doing on this front:
- Capitalizing the categories in the feed
- More clearly displaying the filters selected
- We are also thinking about better ways to make the model categories more scannable and readable in the feed
Update 2018-08-22: Decisions on copyvio
As the Growth team has been working on adding AfC drafts and ORES to the New Pages Feed (now testable in Test Wiki), we have also been planning how to add the first copyvio detection tool to the New Pages Feed. This post is about our plan to use CopyPatrol (and the Turnitin service) to accomplish this. Read below for the plan and background, and please speak up on the talk page with your thoughts and reactions – the point, after all, is to build something that helps reviewers get their work done. We've also posted the brief statistical analysis our team did as a part of this planning process.
Current plan
- Every page in the New Pages Feed, including both the NPP and AfC pages, will automatically be checked for copyvio using CopyPatrol, a tool built by WMF's Community Tech team, which in turn uses the Turnitin service.
- If potential violations are found, the page’s listing in the New Pages Feed will be flagged with red text that says “Potential issues: copyvio”.
- That text will link out to the CopyPatrol interface, which shows the text of the page side-by-side with the text from the source where it was potentially copied from. Reviewers can use that interface to look at the text in detail and decide whether there really is a violation.
- If the page is edited with a revision over a certain number of bytes, the revision will be checked and its indicator in the feed will be updated. Essentially, that indicator will mean “one of this page’s revisions had a potential violation”.
- You can still use Earwig's Copyvio Detector for additional checks, in exactly the same way that reviewers have used it for years.
Below is a quick mockup of what this might look like.
You can see that in the third draft in the list, next to "Possible issues", "copyvio" is listed in red. This word is a link to the CopyPatrol interface, where reviewers can investigate potential violations. Below is a screenshot from CopyPatrol showing its existing interface.
Background
Back when we were planning this effort in May, reviewers participating in the discussion seemed to agree that pre-checking pages for copyvio would help increase reviewing efficiency. The idea is that reviewers could quickly find those pages that are most likely to have copyvio problems, and would save time by not needing to wait as a copyvio tool runs for each page that a reviewer works on.
As the Growth team has been working on the other two major parts of this New Pages Feed upgrade (adding AfC drafts, and adding ORES scores), we have simultaneously been debating the right way to approach the copyvio part. This has been difficult, because unlike with ORES, we rely on third-party services for copyvio detection, like Google (via Earwig's Copyvio Detector) and Turnitin (via CopyPatrol). Integrating third parties into the MediaWiki software adds technical complexity and risk to our software, since we won’t be able to completely control the services that we’ll be relying on.
We have put a lot of thought into this, and we’ve decided to add copyvio detection in the New Pages Feed using CopyPatrol / Turnitin. The main alternative we considered is Earwig's Copyvio Detector / Google. There are three main reasons we have decided to build with CopyPatrol / Turnitin.
- Performance: our team analyzed the performance of the two services, and we did not find evidence that one service is better at detecting copyvio than the other (though deeper analysis would likely shed more light on the question). The pages they flag are somewhat, but not highly correlated, suggesting that in the long run, the two services may be complementary for finding copyvio – though integrating with both is out of the scope of this project. To read our analysis in depth, please see this page.
- Technical: Turnitin’s API is built specifically for checking for copyright issues, making it straightforward to work with. In other words, all one needs to do with Turnitin's API is send it the text on a page, and it tells you what percent of that page has been found elsewhere on the internet or in its databases. We will also be able to easily integrate with the existing bots and interfaces underlying CopyPatrol. These things together mean that we’ll be able to deploy something useful in a much shorter timeframe than if we were working with Earwig's Copyvio Detector / Google.
- Resources: Because CopyPatrol already checks every substantial edit in English Wikipedia, including new page creations, integrating with the New Pages Feed will add no additional load to our Turnitin credits. We’ll just be surfacing in the New Pages Feed those instances that CopyPatrol is already finding. This would not be the case with the Earwig / Google tool, in which we would be substantially taxing our Google credits and limits.
Additional details
- When our team was first learning about Turnitin, we thought that Turnitin only compared pages to academic journals and things of that nature. We've learned that it actually does compare pages to websites, and even to archived websites that no longer exist. This contributes to its high coverage.
- Because CopyPatrol checks new revisions above a certain size, the New Pages Feed will flag a page if any of its revisions have had copyvio. That means that if the violating text is removed from the page, it will still be flagged in the feed. We are hoping this is not an issue because such a page, having had its violating text removed, would also be patrolled in the same session and therefore no longer be in the feed.
- We are building the underlying architecture here so that other copyvio services could be plugged into it in the future (such as Earwig's / Google). Though using more than one service is out of scope for this project, the technical components will be in place to make it possible at some other point.
Update 2018-08-30: target date for first upgrade to production is September 17
Now that reviewers have had a few weeks to test out the changing New Pages Feed in Test Wiki, we want to get some of the improvements out into the real world so they can help reviewers. Specifically, we're planning on deploying the first of the three parts of this project to English Wikipedia on September 17: adding the "Articles for Creation" side to the feed. AfC reviewers would then be able to browse drafts in the feed, filter on their states, and sort by submitted and declined dates. This would leave the classic NPP workflow unchanged, except for the toggle button for "AfC".
As community reviewers tested the feed in Test Wiki, a couple of bugs and ideas were surfaced that our team has largely addressed, and it does not seem like there are major blocking issues. That said, we know that there is more to making a new feature successful than simply flipping it on. These are some of the things that I think would be good to address, and I'm looking for thoughts from reviewers about how best to address them:
- Training and notifying AfC reviewers on using the "AfC" side of the New Pages Feed.
- Altering the text at the top of Special:NewPagesFeed to accurately reflect that it is used for multiple purposes now.
- Updating help documentation and screenshots in the NPP and AfC projects.
I am happy to help with any of the documentation or screenshots.
And then following September 17, here are some tentative dates for rolling out the second and third parts of this project (these may change, but give a sense of the pace of our work):
- October 1: adding ORES models
- October 15: adding copyvio
Let's discuss on the talk page if there are any concerns, and what the correct order of operations is here so that we can start getting the useful new features into the hands of reviewers!
Update 2018-09-06: copyvio detection ready for testing
A couple weeks ago, we posted above on our plans for integrating copyvio detection to the New Pages Feed. It seemed like the plan made sense to reviewers who read it, and so we've implemented it in Test Wiki so that reviewers can try it out. Please check it out and let us know what you think.
Here's how it works:
- CopyPatrol checks all diffs in the Article and Draft spaces (including the first revision of a page) that have 500 bytes or more (excluding wikitext markup). 500 bytes is about three sentences.
- For any pages that have a diff that got flagged by CopyPatrol, that page will say "Potential issues: Copyvio" in the New Pages Feed, which will include a link to CopyPatrol. The feed can be filtered to just those pages with the flag.
- Reviewers can click on that link to inspect the potential violation, and to see whether it has already been resolved by someone else doing CopyPatrol work.
Some notes:
- CopyPatrol does not scan pages in Test Wiki. So to simulate how this will work in English Wikipedia, the six pages flagged as "Potential issues: Copyvio" in Test Wiki are actually linking to CopyPatrol for pages in English Wikipedia that have the same names. Unfortunately, although it was possible to do so with ORES models, it will not be possible for reviewers to tweak the content in Test Wiki pages in order to get a sense of how CopyPatrol flags pages. But you can look at the six English Wikipedia pages to see the actual diffs that were flagged.
- When a page has the "Potential issues: Copyvio" flag in the feed, it means that at least one of the substantial (500 bytes or more) revisions to the page has been flagged by CopyPatrol at some point.
- When issues have been resolved in CopyPatrol, the indicator will not disappear from the feed. Once a page has been flagged for potential copyvio in the feed, it will stay that way.
- If a page has very little content (under 500 bytes), it will not get scanned by CopyPatrol. If this seems to be problematic, we can discuss (along with the CopyPatrol community) altering the threshold.
- CopyPatrol does not scan User space pages, so no User space pages will have the "Potential issues: Copyvio" in the feed.
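The flagging behavior described in these notes can be summarized with a small Python sketch. It is illustrative only: the `Revision` record and function names are hypothetical, and the checker's verdict is passed in as plain data rather than obtained from CopyPatrol or Turnitin.

```python
# Hypothetical sketch of when a page shows "Potential issues: Copyvio".
from dataclasses import dataclass

MIN_BYTES = 500  # threshold from the notes above (excluding wikitext markup)
SCANNED_NAMESPACES = {"Article", "Draft"}  # User space is not scanned


@dataclass
class Revision:
    added_bytes: int          # size of the diff, excluding wikitext markup
    flagged_by_checker: bool  # made-up input: did the checker report a match?
    resolved: bool = False    # whether someone has since resolved the report


def is_scanned(namespace, revision):
    """A diff is checked only in scanned namespaces and at 500+ bytes."""
    return namespace in SCANNED_NAMESPACES and revision.added_bytes >= MIN_BYTES


def has_copyvio_flag(namespace, revisions):
    """True if any scanned revision was ever flagged.

    The `resolved` field is deliberately ignored: per the notes above,
    the indicator stays in the feed once a page has been flagged.
    """
    return any(
        is_scanned(namespace, revision) and revision.flagged_by_checker
        for revision in revisions
    )
```

For example, a draft whose only flagged revision has since been marked resolved would still carry the indicator, while the same revision history in User space would not.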
Update 2018-09-27: deployment schedule update
A previous update laid out our team's schedule for deploying the three parts of the new feature set to the New Pages Feed in English Wikipedia. This is an update on how deployment has gone so far and what the schedule holds going forward.
As planned, we did deploy AfC to the New Pages Feed on September 17. The reason we've not announced that the feed is ready to be used by AfC reviewers is that we discovered a set of bugs and issues that we've been fixing since that date. Since AfC reviewers already have a functioning workflow, we would prefer that they try out a well-functioning new workflow rather than a buggy new workflow, and so I have not yet declared victory at the AfC discussion page. I expect that the feed will be in shape for that at the beginning of next week, and at that time, I'll include brief instructions for how to use the feed for AfC review. However, for those of you who have been following along on this project, you can see that the New Pages Feed now has its "Articles for Creation" side. We are still fixing a few UI bugs having to do with the sorting and filtering menus, but you are welcome to try it out.
Here's what's coming up:
- October 1 or 2: final UI bugs fixed in New Pages Feed having to do with adding AfC. AfC community can then start using the feed. I will post here and at AfC talk when this is ready.
- October 4 or the beginning of the following week: ORES scores added to both the NPP and AfC sides of the feed.
- October 15 or that week: copyvio detection added to both the NPP and AfC sides of the feed.
For those interested, here are the details on the first deployment and its challenges:
After deploying the feature itself, we needed to spend a few days actually populating the feed with the 40,000+ drafts in English Wikipedia, along with their states ("Awaiting review", "Declined", etc.) and their submitted and declined dates. Those dates have proved to be difficult, because the MediaWiki database does not retain a record of when templates and categories were applied to pages. We've approximated the submitted and declined dates using the most recent edit date, but we have a task open to make them more accurate if reviewers find that the dates that are currently in the feed are not close enough for their work. We are still in the process of fixing a couple of UI bugs having to do with the default and sticky values for sorting and filtering selections: T205168 and T205324.
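As a rough illustration of the date approximation described above, the following Python sketch picks a draft's most recent edit timestamp as its submitted or declined date. The function name and inputs are hypothetical, and this is not the code used to populate the feed.

```python
# Hypothetical sketch of approximating a draft's submitted/declined date.
from datetime import datetime

DATED_STATES = {"Awaiting review", "Under review", "Declined"}


def approximate_state_date(state, edit_timestamps):
    """Use the most recent edit as a stand-in for the state-change date.

    Returns None for states that carry no submitted/declined date, or
    when no edit timestamps are known.
    """
    if state not in DATED_STATES or not edit_timestamps:
        return None
    return max(edit_timestamps)


# Made-up edit history for one draft.
edits = [
    datetime(2018, 8, 1, 12, 0),
    datetime(2018, 9, 3, 9, 30),
    datetime(2018, 9, 10, 17, 45),
]
print(approximate_state_date("Declined", edits))
```

The limitation is visible in the sketch: any later edit, even one unrelated to the submission or decline, moves the approximated date forward.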
Update 2018-10-01: AfC reviewers can now use the New Pages Feed!
We fixed the bugs that I mentioned in the previous update, and so now the New Pages Feed is ready to be used for AfC review! Here is the announcement on the AfC discussion page. This is a great milestone for this project -- it's our first of three releases (ORES and copyvio are the next two) and will hopefully give AfC reviewers a tool that helps them prioritize their work, and ultimately get high-quality drafts into the article space faster. Thank you all for weighing in, following along, and helping us get to this point!
Our team is quickly turning our full attention to adding ORES scores to the feed (for both AfC and NPP) this Thursday, October 4, or in the days that follow. I will post additional updates as that initiative unfolds. As always, please comment on the talk page with any thoughts! -- MMiller (WMF) (talk) 20:52, 1 October 2018 (UTC)
Update 2018-10-05: ORES now added to the New Pages Feed
Our deployment to add two sets of ORES scores to the New Pages Feed went smoothly yesterday. All pages have scores for both models, and they seem to be the scores that we expect them to have. We've posted here at AfC talk and here at NPP talk to announce and explain the changes. In terms of the objectives of this project, we're excited about this deployment because AfC reviewers will now be able to do things like filter the New Pages Feed to just those drafts that are predicted to be "B-class" and above, thereby accelerating the rate at which high-quality content makes it to the article namespace, and potentially accelerating how quickly a good-faith newbie gets some positive feedback.
To see interesting counts of how many pages in the feed ended up with which predictions, feel free to check out this Phabricator task.
Our team will now turn our attention to the third, and final, part of this project: adding copyvio detection to the feed. I will be back with more updates as we plan for that deployment the week of October 15 or October 22.
Update 2018-10-17: Copyvio now available for testing
Over the last two weeks, the team has been working to deploy copyvio detection, the third and final component of this project, to English Wikipedia. This has involved the community-driven bot approval process for a modification to EranBot 3, which backs CopyPatrol and which now serves information to the New Pages Feed. This process precipitated some changes to the software, and the creation of a new log, which shows exactly which pages and revisions are being flagged as potentially violating.
The new feature is now deployed for a trial period in which NPP and AfC reviewers can try out the feature at this special URL (as opposed to the regular URL for the New Pages Feed). This trial period will last into next week, at which point the team will decide when to release the feature at the regular URL. The ability to test has been announced here at AfC talk and here at NPP talk. We'll expect to receive feedback at those talk pages, or on this project's talk page. When the feature is released, we'll post recommendations for how to use it.
Update 2018-10-30: Copyvio detection now added to the New Pages Feed
Yesterday, the team deployed copyvio detection via CopyPatrol to the New Pages Feed. This is the third and final component of this project (along with adding drafts to the feed, and adding ORES scores to the feed). The trial period for the bot that backs this feature went well, and the results of that bot trial are archived here. Only one minor issue was discovered during the trial period.
Our testing shows that all pages in the New Pages Feed that have been flagged by CopyPatrol are also flagged in the New Pages Feed, with links that go between them. Reviewers will now be able to use this information to further prioritize and triage pages waiting for NPP and AfC review in the feed.
We've posted here at AfC talk and here at NPP talk to announce and explain the changes.
Since this is the final component of this project, we're going to keep an eye on it for the next week to make sure everything continues to work as expected. Then we will post to wrap up the project.
Update 2018-12-06: project wrap-up and final post
Now that all the components of this project have been in production for a month without any unsolved issues, it is time to wrap this project up. The effort to improve the efficiency of the Articles for Creation process began in April 2018 and the work was completed in October 2018, seven months later. During the design process, the project expanded from only relating to Articles for Creation to also involving the New Page Review process through work on the New Pages Feed. We were in close contact with both communities throughout the process, and had great discussions where important consensus was built. We're grateful for the community members who tested the software at every step of the development process so that we could be confident that we were building something valuable.
Along the way to adding AfC, ORES, and copyvio detection to the New Pages Feed, we also fixed many bugs in the feed and improved existing components of the feed to make more sense for the contemporary reviewing processes. This mediawiki.org page has been updated to document the 2018 improvements.
Thank you to the AfC and NPP community members who spent their volunteer time thinking about this project and helping us build it.